Enabling Efficient Dynamic Resizing of Large DRAM Caches via A Hardware Consistent Hashing Mechanism
Authors
Abstract
Die-stacked DRAM has been proposed for use as a large, high-bandwidth, last-level cache with hundreds or thousands of megabytes of capacity. Not all workloads (or phases) can productively utilize this much cache space, however. Unfortunately, the unused (or under-used) cache continues to consume power due to leakage in the peripheral circuitry and periodic DRAM refresh. Dynamically adjusting the available DRAM cache capacity could largely eliminate this energy overhead. However, the current proposed DRAM cache organization introduces new challenges for dynamic cache resizing. The organization differs from a conventional SRAM cache organization because it places entire cache sets and their tags within a single bank to reduce on-chip area and power overhead. Hence, resizing a DRAM cache requires remapping sets from the powered-down banks to active banks. In this paper, we propose CRUNCH (Cache Resizing Using Native Consistent Hashing), a hardware data remapping scheme inspired by consistent hashing, an algorithm originally proposed to uniformly and dynamically distribute Internet traffic across a changing population of web servers. CRUNCH provides a load-balanced remapping of data from the powered-down banks alone to the active banks, without requiring sets from all banks to be remapped, unlike naive schemes to achieve load balancing. CRUNCH remaps only sets from the powered-down banks, so it achieves this load balancing with low bank power-up/down transition latencies. CRUNCH’s combination of good load balancing and low transition latencies provides a substrate to enable efficient DRAM cache resizing.
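The remapping property described in the abstract can be illustrated with a small software sketch of textbook consistent hashing. This is not the CRUNCH hardware design, whose details are not given here; the class name `BankRing`, the `points_per_bank` parameter, and the SHA-256-based hash are all assumptions chosen for illustration. Each bank is placed at several points on a hash ring, and a cache set belongs to the first active bank found clockwise from the set's own hash. Powering a bank down then moves only the sets that lived in that bank.

```python
import bisect
import hashlib


def _h(key: str) -> int:
    """Deterministic 32-bit hash (illustrative; hardware would use something cheaper)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:4], "big")


class BankRing:
    """Hypothetical consistent-hashing map from cache set index to DRAM bank."""

    def __init__(self, num_banks: int, points_per_bank: int = 64):
        self.active = set(range(num_banks))
        # Every bank owns several points on the ring, so that the sets of a
        # powered-down bank can fall through to different successor banks.
        ring = sorted(
            (_h(f"bank{b}:pt{p}"), b)
            for b in range(num_banks)
            for p in range(points_per_bank)
        )
        self._keys = [k for k, _ in ring]
        self._banks = [b for _, b in ring]

    def bank_for_set(self, set_index: int) -> int:
        """Walk clockwise from the set's hash to the first point of an active bank."""
        if not self.active:
            raise RuntimeError("no active banks")
        start = bisect.bisect_right(self._keys, _h(f"set{set_index}"))
        n = len(self._banks)
        for step in range(n):
            bank = self._banks[(start + step) % n]
            if bank in self.active:
                return bank
        raise RuntimeError("unreachable: an active bank always has ring points")

    def power_down(self, bank: int) -> None:
        self.active.discard(bank)

    def power_up(self, bank: int) -> None:
        self.active.add(bank)


# Usage: power down bank 3 and check which sets changed banks.
ring = BankRing(num_banks=8)
before = {s: ring.bank_for_set(s) for s in range(4096)}
ring.power_down(3)
after = {s: ring.bank_for_set(s) for s in range(4096)}
moved = [s for s in before if before[s] != after[s]]
assert all(before[s] == 3 for s in moved)  # only bank 3's sets were remapped
```

In this sketch, a naive alternative such as `set_index % number_of_active_banks` would change the bank of most sets whenever the bank count changes, which is the all-banks remapping cost the abstract contrasts against.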
Similar resources
Addendum to “Efficiently Enabling Conventional Block Sizes for Very Large Die-stacked DRAM Caches”
Abstract The MICRO 2011 paper “Efficiently Enabling Conventional Block Sizes for Very Large Die-stacked DRAM Caches” proposed a novel die-stacked DRAM cache organization embedding the tags and data within the same physical DRAM row and then using compound access scheduling to manage the hit latency and a MissMap structure to make misses more efficient. This addendum provides a revised performan...
DRAM Aware Last-Level-Cache Policies for Multi-core Systems
x latency DTC in two cycles. In contrast, state-of-the-art DRAM cache always reads the tags from DRAM cache that incurs high tag lookup latencies of up to 41 cycles. In summary, high DRAM cache hit latencies, increased inter-core interference, increased inter-core cache eviction, and the large application footprint of complex applications necessitates efficient policies in order to satisfy the ...
Multi-Level Cache Resizing
Hardware designers are constantly looking for ways to squeeze waste out of architectures to achieve better power efficiency. Cache resizing is a technique that can remove wasteful power consumption in caches. The idea is to determine the minimum cache a program needs to run at near-peak performance, and then reconfigure the cache to implement this efficient capacity. While there has been signif...
C3D: Mitigating the NUMA bottleneck via coherent DRAM caches
Massive datasets prevalent in scale-out, enterprise, and high-performance computing are driving a trend toward ever-larger memory capacities per node. To satisfy the memory demands and maximize performance per unit cost, today’s commodity HPC and server nodes tend to feature multi-socket shared memory NUMA organizations. An important problem in these designs is the high latency of accessing mem...
MigrantStore: Leveraging Virtual Memory in DRAM-PCM Memory Architecture
With the imminent slowing down of DRAM scaling, Phase Change Memory (PCM) is emerging as a lead alternative for main memory technology. While PCM achieves low energy due to various technology-specific advantages, PCM is significantly slower than DRAM (especially for writes) and can endure far fewer writes before wearing out. Previous work has proposed to use a large, DRAM-based hardware cache t...
Journal title:
- CoRR
Volume abs/1602.00722 Issue
Pages -
Publication date 2016